
Lesson 3: The ReAct Architecture (Reason + Act)

In the previous lesson's Decision Engine, we actually already wrote a solid set of fault-tolerance code.

If you look closely, you will notice two loops inside it:

  • The first loop handles responses that fail validation or do not match the requirements.

    The current strategy: append the error message to the history and ask again; once the maximum number of retries is exceeded, stop asking and return a conservative fallback answer.

    In the function decide_with_retry:

    for attempt in range(max_retries + 1):  # retry a bounded number of times, feeding errors back

        # If the previous response failed validation, add the error message to this round's conversation.
        if err_msg:
            messages.append({
                "role": "system",
                "content": f"Your previous output failed validation: {err_msg}. "
                           f"Output ONLY one valid JSON object that matches the schema."
            })

        # Get the model's response
        raw = call_llm(messages)

        # Error type 1: the response string is not strictly valid JSON
        try:
            obj = json.loads(raw)  # parse the str into a dict
        except Exception as e:
            err_msg = f"Invalid JSON parse error: {type(e).__name__}"
            continue

        # Error type 2: the response contains a logic error; check it with the validator defined earlier
        ok, reason = validate_decision(obj, tools_available)
        if ok:   # on success, return the decision dict: action, tool, tool_input, final
            return obj
        else:    # on failure, record the error reason for the next round
            err_msg = reason

        # Exponential backoff (simple version)
        time.sleep(0.4 * (2 ** attempt))
    
  • The second loop handles tool-call failures, although it is not actually written as a for loop.

    The current strategy: write the tool-call error into the tool's result (the Observation), append it to the chat history inside decide_with_retry, and ask the model again once (no retry loop). In effect, only a single tool-call failure is tolerated; this could also be changed to allow a configurable maximum number of failures (see the sketch after the code below).

    
    if decision1["action"] == "tool_call":   # if the first decision says "use a tool"
        tool = decision1["tool"]             # the tool's name
        tool_input = decision1["tool_input"] # the input the tool needs

        # Get the tool's return value (i.e. the observation)
        try:
            if tool == "search":
                query = tool_input.get("query", "")
                obs = TOOLS["search"](query)
            else:
                raise RuntimeError("Unknown tool")
        except Exception as e:
            obs = f"[TOOL_ERROR] {type(e).__name__}: {e}"

        # Step 2: inject the Observation and ask the model to output a final answer based on it
        decision2 = decide_with_retry(
            state={"goal": goal, "note": "Use the observation to answer."},  # tell the model to answer from the observation
            tools_available=tools_available,
            last_observation=obs
        )
        print("Decision2:", json.dumps(decision2, ensure_ascii=False))
    else:
        # No tool needed, output final directly
        print("Final:", decision1["final"])
    
    

(1) The ReAct Structure

This check-and-recover mechanism is exactly the ReAct architecture this lesson covers: it can handle failures of both model responses and tool calls without getting stuck in an infinite loop.

ReAct = Reason + Act + Observation

In other words:

  • At each step, the LLM produces a decision following the template (Reason), which we validate
  • We execute the tool the LLM chose (Act) and collect the feedback (Observation)
  • The Observation goes into the next round's conversation memory. Repeat until finish
[State] → LLM(Decision) → [Action JSON]
        → Validator / Guardrails
        → Tool Executor / Error Handler
        → Observation
        → Update State → (back to [State])
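The same cycle as a schematic Python skeleton (the full runnable version, run_cot_tool_agent, appears at the end of this lesson):

# Schematic only: each pass = Reason (decide + validate) → Act (tool) → Observation → state update
state = {"goal": goal}
observation = None
for step in range(max_steps):  # hard upper bound so the loop can never run forever
    decision = decide_with_retry(state, tools_available, observation)  # Reason, with guardrails
    if decision["action"] in ("finish", "ask_user"):
        break  # terminal actions
    if decision["action"] == "tool_call":  # Act
        try:
            observation = TOOLS[decision["tool"]](**decision["tool_input"])
        except Exception as e:
            observation = f"[TOOL_ERROR] {type(e).__name__}: {e}"  # errors become Observations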


(2) Fault-Tolerance Mechanisms

1. Checking the model's output (Validator)

raw = call_llm(...)
obj = json.loads(raw)
validate_decision(obj)

The Validator performs at least three layers of checks (sketched after this list):

  1. Structure: are all required fields present?
  2. Types: isinstance(...) checks
  3. Semantics:
    • is the action allowed?
    • does the tool exist?
    • are the fields consistent with the action?
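A compact sketch of the three layers (ALLOWED_ACTIONS and ALLOWED_TOOLS are the guardrail sets defined in section (4); the full validate_decision there implements these checks in detail):

from typing import Tuple

def validate_decision_sketch(obj: dict) -> Tuple[bool, str]:
    # 1. Structure: every required key must be present
    for k in ("action", "tool", "tool_input", "final"):
        if k not in obj:
            return False, f"Missing key: {k}"
    # 2. Types: minimal isinstance checks
    if obj["tool"] is not None and not isinstance(obj["tool"], str):
        return False, "tool must be a string or null"
    # 3. Semantics: action allowed, tool known, fields consistent with the action
    if obj["action"] not in ALLOWED_ACTIONS:
        return False, f"Invalid action: {obj['action']}"
    if obj["action"] == "tool_call" and obj["tool"] not in ALLOWED_TOOLS:
        return False, f"Tool not allowed: {obj['tool']}"
    if obj["action"] == "tool_call" and obj["final"] is not None:
        return False, "final must be null when action=tool_call"
    return True, "ok"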

2. Checking tool calls

A tool failure ≠ an Agent crash. A tool failure = a new Observation.

try:
    obs = tool(...)
except Exception as e:
    obs = f"[TOOL_ERROR] {type(e).__name__}: {e}"

Then:

  • Feed this obs back to the model as an Observation
  • Let the model decide whether to (a sketch follows this list):
    • switch tools
    • replan
    • ask_user
    • finish (degrade gracefully)
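A minimal sketch of this feedback step, reusing decide_with_retry from this lesson (the error string and the note wording are made up for illustration):

# The failed call is fed back as an ordinary Observation; the model picks the next action.
obs = "[TOOL_ERROR] TimeoutError: search timed out"  # illustrative error
decision = decide_with_retry(
    state={"goal": goal, "note": "The previous tool call failed. Choose another action."},
    tools_available=tools_available,
    last_observation=obs,
)
# decision["action"] may now be another tool_call, "replan", "ask_user", or "finish"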

Things you must never do ❌

  • raise the tool exception directly
  • hide the failure reason from the model
  • retry silently, without limit

3. How to keep the Agent out of infinite loops

You need at least four layers of protection (a combined sketch follows this list):

  • max_steps (hard upper bound)

for step in range(max_steps):

  • Action constraints (the model must not run wild)

    • no unlimited repetition of the same tool_call

    • a cap on consecutive replan actions

  • Detecting that the state has stopped changing

if state == last_state:
    force_replan()

  • Exponential backoff (which you just learned)
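A combined sketch of all four layers in one loop, reusing decide_with_retry and TOOLS from this lesson (the repeat threshold and the "stuck" note wording are illustrative choices):

# Sketch: max_steps + action constraints + state-change detection + backoff together.
import copy
import json
import time

def guarded_loop(goal: str, max_steps: int = 6) -> str:
    state = {"goal": goal}
    observation = None
    last_state = None
    last_action_key = None
    repeated_actions = 0

    for step in range(max_steps):                        # layer 1: hard upper bound
        decision = decide_with_retry(state, set(TOOLS), observation)

        # layer 2: action constraints — no unlimited repetition of the same action
        key = (decision["action"], json.dumps(decision["tool_input"], sort_keys=True))
        repeated_actions = repeated_actions + 1 if key == last_action_key else 0
        last_action_key = key
        if repeated_actions > 2:
            return "Unable to proceed. Please clarify the goal."

        # layer 3: state stopped changing — force a replan via the note
        if state == last_state:
            state["note"] = "Stuck: state unchanged. Replan with a different approach."
        last_state = copy.deepcopy(state)

        if decision["action"] in ("finish", "ask_user"):
            return decision["final"]
        if decision["action"] == "replan":
            state["note"] = "Replan: change approach."
            observation = None
        if decision["action"] == "tool_call":
            try:
                observation = TOOLS[decision["tool"]](**decision["tool_input"])
            except Exception as e:
                observation = f"[TOOL_ERROR] {type(e).__name__}: {e}"

        time.sleep(min(2.0, 0.4 * (2 ** step)))          # layer 4: exponential backoff
    return "Failed: exceeded max_steps."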

(3) RePlan When the Reasoning Chain Breaks Down

Sometimes you will see these telltale symptoms:

  • consecutive tool_call steps whose results are useless
  • the same action repeated over and over
  • outputs getting shorter and shorter / empty
  • the Validator failing repeatedly

This is why ReAct must have a built-in "self-rescue" capability (Re-evaluate / Replan).

Strategy 1: an explicit replan action

You have already allowed:

{"action": "replan"}

When replanning:

  • clear the observation

  • update the state:

    state["note"] = "Previous approach failed. Try a new plan."
    

Strategy 2: a forced meta prompt

Add one line to the system prompt:

If you are stuck or repeating actions, choose "replan".

Strategy 3: an external forced cut-off

if repeated_actions > 2:  # counter maintained as in the guarded-loop sketch above
    return "Unable to proceed. Please clarify the goal."


(4) Tool Schemas

If you look closely, you will notice that in the current code:

👉 nothing anywhere actually "specifies" that tool_search's input parameter must be named query
👉 the model outputs {"query": ...} purely as guessed / habitual / probabilistic behavior
👉 in engineering terms that is ❌ unsafe, ❌ uncontrollable, and ❌ bound to break eventually

So you need to declare the tool schema explicitly (for the model to see).

1. The prompt template

TOOL_SCHEMAS = {
    "search": {
        "description": "Search the web for information",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Search keywords"
                }
            },
            "required": ["query"]
        }
    }
}

Inject the schema into the prompt (this is the key step):

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "system", "content": f"Tool schemas:\n{json.dumps(TOOL_SCHEMAS)}"},
    {"role": "user", "content": json.dumps({"state": state})}
]

This is how the model knows:

  • the search tool exists
  • it must provide query
  • query is a string
  • leaving it out is invalid

2. Upgrading the Validator to match

if action == "tool_call":
    schema = TOOL_SCHEMAS[tool]["parameters"]
    required = schema["required"]
    for k in required:
        if k not in obj["tool_input"]:
            return False, f"Missing required tool_input field: {k}"

3. Complete code

  1. Guardrails: action space + tool whitelist + tool schemas

import json
import time
import random
import requests
from typing import Any, Dict, Optional, Tuple

API_KEY = "YOUR_API_KEY"  # fill in your key
CHAT_URL = "https://api.openai.com/v1/chat/completions"  # or your provider's OpenAI-compatible endpoint

ALLOWED_ACTIONS = {"tool_call", "finish", "replan", "ask_user"}

TOOL_SCHEMAS = {
    "search": {
        "description": "Search the web for information. Use it when you need external facts.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search keywords"},
            },
            "required": ["query"],
            "additionalProperties": False
        }
    }
}

ALLOWED_TOOLS = set(TOOL_SCHEMAS.keys())
  2. Tool implementation (a "fake search" for the demo; you can replace it with a real web search)
def tool_search(query: str) -> str:
    # Demo: returns fixed content, used to demonstrate Observation injection
    # You can replace it with a real search: SerpAPI / your own crawler / your web.run, etc.
    return f"[SEARCH_RESULT] query={query}\n- Agent = a system that uses an LLM to decide actions, can use tools, and updates state using observations."

TOOLS = {"search": tool_search}
  3. System Prompt: strong constraints
SYSTEM_PROMPT = f"""
You are an agent decision engine.

You MUST output exactly one JSON object. No markdown, no extra text.

Allowed actions: {sorted(ALLOWED_ACTIONS)}.
Allowed tools (only if action is tool_call): {sorted(ALLOWED_TOOLS)}.

Decision JSON Schema:
{{
  "action": "tool_call|finish|ask_user|replan",
  "tool": "string|null",
  "tool_input": "object|null",
  "final": "string|null"
}}

Rules (anti-hallucination):
- You MUST NOT fabricate any external facts.
- Tool results can ONLY come from an Observation provided by the system.
- If you need external info, choose action="tool_call" and specify tool/tool_input.
- When action="tool_call": final MUST be null.
- When action="finish": tool and tool_input MUST be null.
- If you are stuck or repeating, choose action="replan" or "ask_user".

Tool Schemas:
{json.dumps(TOOL_SCHEMAS, ensure_ascii=False)}
""".strip()
  4. Calling the LLM
def call_llm(messages, model="gpt-4o", max_tokens=280, temperature=0.2) -> str:
    headers = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}
    payload = {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    r = requests.post(CHAT_URL, headers=headers, json=payload, timeout=30)
    r.raise_for_status()
    return r.json()["choices"][0]["message"]["content"]
  5. Validators: strictly check the decision (structure + types + semantics + tool_input schema)
def _validate_tool_input_schema(tool: str, tool_input: Dict[str, Any]) -> Tuple[bool, str]:
    schema = TOOL_SCHEMAS[tool]["parameters"]
    required = schema.get("required", [])
    props = schema.get("properties", {})
    additional = schema.get("additionalProperties", True)

    # required fields
    for k in required:
        if k not in tool_input:
            return False, f"Missing required tool_input field: {k}"

    # type checks (minimal)
    for k, v in tool_input.items():
        if k not in props:
            if additional is False:
                return False, f"Unexpected tool_input field: {k}"
            continue
        expected_type = props[k]["type"]
        if expected_type == "string" and not isinstance(v, str):
            return False, f"tool_input.{k} must be string"
        if expected_type == "object" and not isinstance(v, dict):
            return False, f"tool_input.{k} must be object"

    return True, "ok"


def validate_decision(obj: Dict[str, Any], tools_available: set) -> Tuple[bool, str]:
    # required keys
    for k in ("action", "tool", "tool_input", "final"):
        if k not in obj:
            return False, f"Missing key: {k}"

    # action must be allowed
    action = obj["action"]
    if action not in ALLOWED_ACTIONS:
        return False, f"Invalid action: {action}"

    # semantic consistency for tool_call
    if action == "tool_call":
        tool = obj["tool"]
        if not isinstance(tool, str):
            return False, f"tool must be string when action=tool_call, got: {tool}"
        if tool not in tools_available:
            return False, f"Tool not allowed/available: {tool}"

        if not isinstance(obj["tool_input"], dict):
            return False, "tool_input must be an object when action=tool_call"

        ok, reason = _validate_tool_input_schema(tool, obj["tool_input"])
        if not ok:
            return False, reason

        if obj["final"] is not None:
            return False, "final must be null when action=tool_call"

    else:
        # non-tool_call: tool/tool_input must be null; final must be a str (ask_user/replan/finish all return human-readable text)
        if obj["tool"] is not None or obj["tool_input"] is not None:
            return False, "tool and tool_input must be null when action is not tool_call"
        if not isinstance(obj["final"], str):
            return False, "final must be a string when action is not tool_call"

    return True, "ok"
  6. decide_with_retry: automatically correct and retry on parse/validation failures
def decide_with_retry(
    state: Dict[str, Any],
    tools_available: set,
    last_observation: Optional[str] = None,
    max_retries: int = 2,
    model: str = "gpt-4o",
) -> Dict[str, Any]:

    base_messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": json.dumps({"state": state, "tools_available": sorted(tools_available)}, ensure_ascii=False)},
    ]

    if last_observation is not None:
        # Key point: inject the Observation as a system message, telling the model it is the only trusted source of external facts
        base_messages.append({"role": "system", "content": f"Observation:\n{last_observation}"})

    err_msg = None
    for attempt in range(max_retries + 1):
        messages = list(base_messages)
        if err_msg:
            messages.append({
                "role": "system",
                "content": f"Your previous output failed validation: {err_msg}. "
                           f"Output ONLY one valid JSON object that matches the schema."
            })

        raw = call_llm(messages, model=model, temperature=0.2, max_tokens=280)

        try:
            obj = json.loads(raw)
        except Exception as e:
            err_msg = f"Invalid JSON parse error: {type(e).__name__}"
            time.sleep(min(2.0, 0.4 * (2 ** attempt)) + random.random() * 0.1)
            continue

        ok, reason = validate_decision(obj, tools_available)
        if ok:
            return obj

        err_msg = reason
        time.sleep(min(2.0, 0.4 * (2 ** attempt)) + random.random() * 0.1)

    # Graceful degradation fallback
    return {
        "action": "ask_user",
        "tool": None,
        "tool_input": None,
        "final": "I couldn't produce a valid tool/action plan. Please clarify your goal and constraints."
    }
  7. Agent Loop: the model decides how many tool calls to make and when to stop
def run_cot_tool_agent(
    goal: str,
    max_steps: int = 6,
    model: str = "gpt-4o",
) -> str:
    """
    关键点:
    - 每一轮:LLM 输出 decision JSON
    - 若 tool_call:执行工具,得到 Observation,再喂回 LLM
    - 若 finish:返回 final
    - 若 replan:更新 state/重置 observation,继续
    - 若 ask_user:直接返回 final
    """
    tools_available = set(TOOLS.keys())
    state = {"goal": goal}
    observation = None

    for step in range(max_steps):
        decision = decide_with_retry(
            state=state,
            tools_available=tools_available,
            last_observation=observation,
            max_retries=2,
            model=model,
        )

        action = decision["action"]

        if action == "finish":
            return decision["final"]

        if action == "ask_user":
            return decision["final"]

        if action == "replan":
            # Simplest possible replan: write a note into state telling the model to change strategy
            state["note"] = "Replan: change approach. If external facts needed, call a tool."
            observation = None
            continue

        if action == "tool_call":
            tool = decision["tool"]
            tool_input = decision["tool_input"]

            # Run the tool: return exceptions to the model as an Observation (do not raise)
            try:
                obs = TOOLS[tool](**tool_input)
            except Exception as e:
                obs = f"[TOOL_ERROR] {type(e).__name__}: {e}"

            observation = obs
            continue

    return "Failed: exceeded max_steps. Please refine the goal."
if __name__ == "__main__":
    goal = "Explain what an agent is. If external facts are needed, use the search tool. Keep it concise."
    answer = run_cot_tool_agent(goal, max_steps=6, model="gpt-4o")
    print(answer)
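To see why the schema + whitelist pattern pays off, here is a hypothetical second tool (calc and its schema are invented for illustration); the guardrails, validator, and loop all pick it up without further changes:

# Hypothetical extension: register a second tool alongside search.
TOOL_SCHEMAS["calc"] = {
    "description": "Evaluate a simple arithmetic expression.",
    "parameters": {
        "type": "object",
        "properties": {
            "expression": {"type": "string", "description": "e.g. '2 + 3 * 4'"},
        },
        "required": ["expression"],
        "additionalProperties": False
    }
}

def tool_calc(expression: str) -> str:
    # Demo only: charset-restricted eval; use a real expression parser in production.
    if not set(expression) <= set("0123456789+-*/(). "):
        return "[TOOL_ERROR] ValueError: unsupported characters"
    return f"[CALC_RESULT] {expression} = {eval(expression)}"

TOOLS["calc"] = tool_calc
ALLOWED_TOOLS = set(TOOL_SCHEMAS.keys())  # the whitelist follows the schemas automatically

# Caveat: SYSTEM_PROMPT embeds the schemas at definition time, so rebuild it
# (or construct it lazily) after registering new tools.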